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Abstract. We study the inverse problem of estimating a field from data 
comprising a finite set of nonlinear functionals of ^ subject to additive noise; we 
denote this observed data by y. Our interest is in the reconstruction of piecewise 
continuous fields in which the discontinuity set is described by a finite number of 
geometric parameters a. Natural applications include groundwater flow and electrical 
impedance tomography. We take a Bayesian approach, placing a prior distribution 
on and determining the conditional distribution on given the data y. It is 
then natural to study maximum a posterior (MAP) estimators. Recently (Dashti et 
al 2013 Inverse Problems 29 095017) it has been shown that MAP estimators can 
be characterised as minimisers of a generalised Onsager-Machlup functional, in the 
case where the prior measure is a Gaussian random field. We extend this theory 
to a more general class of prior distributions which allows for piecewise continuous 
fields. Specifically, the prior field is assumed to be piecewise Gaussian with random 
interfaces between the different Gaussians defined by a finite number of parameters. 
We also make connections with recent work on MAP estimators for linear problems 
and possibly non-Gaussian priors (Helin, Burger 2015 Inverse Problems 31 085009) 
which employs the notion of Fomin derivative. 

In showing applicability of our theory we focus on the groundwater flow and 
FIT models, though the theory holds more generally. Numerical experiments 
are implemented for the groundwater flow model, demonstrating the feasibility of 
determining MAP estimators for these piecewise continuous models, but also that 
the geometric formulation can lead to multiple nearby (local) MAP estimators. We 
relate these MAP estimators to the behaviour of output from MGMG samples of the 
posterior, obtained using a state-of-the-art function space Metropolis-Bastings method. 


AMS classification scheme numbers: Primary: 62G05, 65N21; Secondary: 49J55 
Keywords: inverse problems, Bayesian approach, geometric priors, MAP estimators, 
EIT, groundwater flow. 

1. Introduction 

1.1. Context and Literature Review 

A common inverse problem is that of estimating an unknown function from noisy 
measurements of a (possibly nonlinear) map applied to the function. Statistical and 
deterministic approaches to this problem have been considered extensively. In this 
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paper we focus on the the study of MAP estimators within the Bayesian approach; these 
estimators provide a natural link between deterministic and statistical methods. In the 
Bayesian formulation, we describe the solution probabilistically and the distribution 
of the unknown, given the measurements and a prior model, is termed the posterior 
distribution. MAP estimators attempt to work with a notion of solutions of maximal 
probability under this posterior distribution and are typically characterised variationally, 
linking to deterministic methods. 

There are two main approaches taken to the study of the posterior. The first is 


to discretise the space, and then apply finite dimensional Bayesian methodology 18 


An advantage to this approach is the availability of a Lebesgue density and a large 
amount of previous work which can then be built upon; but issues may arise (for 
example computationally) when the dimension of the discretisation space is increased. 
An alternative approach is to apply infinite dimensional methodology directly on the 
original space, to derive algorithms, and then discretise to implement. This approach 
has been studied for linear problems in 12,25,27 , and more recently for nonlinear 


problems [^[^[2^|^. It is the latter approach that we focus on in this paper. 

In some situations it may be that point estimates are more desirable, or more 
computationally feasible, than the entire posterior distribution. A detailed study of 
point estimates can be found in for example 24 . Three different estimates are commonly 
considered: the posterior mean which minimises loss, the posterior median which 
minimises loss, and posterior modes which minimise zero-one loss. The former two 
estimates are unique [^, but a distribution may possess more than one mode. A 
consequence of this is that the posterior mean and median may be misleading in the 
case of a multi-modal posterior. Posterior modes are often termed maximum a posteriori 
(MAP) estimators in the literature. 

In this paper we focus on MAP estimation. If the posterior has Lebesgue density p, 
MAP estimators are given by the global maxima of p. The problem of MAP estimation in 
this case is hence a deterministic variational problem, and has been well-studied . In 
the infinite-dimensional setting there is no Lebesgue density, but there has been recent 
research aimed at characterising the mode variationally and linking to the classical 
regularisation techniques described in, for example, in the case when Gaussian priors 
are adopted. Non-Gaussian priors have also been considered in the infinite dimensional 
setting - in weak MAP (wMAP) estimators are defined as generalisations of MAP 
estimators, and a variational characterisation of them is provided in the case that the 
forward map is linear, using the notion of Fomin derivative. 

In this paper we make a significant extension of the work in to include priors 
which are defined by a combination of Gaussian random fields and a finite number of 
geometric parameters which define the different domains in which the different random 
fields apply. We thereby study the reconstruction of piecewise continuous fields with 
interfaces defined by a finite number of parameters. Our motivation for doing so 
comes from the work in j^, and its predecessors. In that paper a Bayesian inverse 
problem for piecewise constant fields, modelling the permeability appearing in a two- 
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Figure 1: An example of construction of a piecewise continuous field, using two 
continuous fields and two scalar parameters. Here the scalar parameters determine 
the points where the interface meets each side of the domain. We work on the space 
of continuous fields and parameters, but it is pushforward of these by the construction 
map that represents the piecewise continuous field we aim to recover. 


phase subsurface flow model, was studied. 


previously studied in a groundwater flow context in 16 


Such piecewise continuous fields were also 
where existence and well- 


posedness of the posterior distribution were shown. The idea of single point estimates 
being misleading is discussed and the existence of multiple local MAP estimators is 
shown. We also link our work to that in 14 , by characterizing the MAP estimator via 
the Fomin derivative. 

Throughout this paper we focus on two model problems: groundwater flow and 
electrical impedance tomography (FIT). Both of these problems are important examples 
of large scale inverse problems, with applications of great economic and societal value. 
MAP estimation in such problems has been studied previously (^[4| [I7 [[m] . However our 
formulation is quite general; for brevity we simply illustrate the theory for groundwater 
flow and FIT, and the numerics only in the case of groundwater flow. 


1.2. Mathematical Setting 

Let A be a separable Banach space and let A C M^. X should be thought of as a 
function space and A a space of geometric parameters. Given (u, a) G A x A, we 
construct another function G Z, say. Considering the ingredients u and o in the 
construction of this function separately will be useful in what follows. An example 
of such a construction is shown in Figure 

Suppose we have a (typically nonlinear) forward operator Q : X x A —>■ T, where 
Y = If (m, a) denotes the true input to our forward problem, we observe data y EY 
given by 

y = g{u,a)Py 

where 77 ~ A(0,r), F G positive definite, is some centred Gaussian noise on Y. 

Modelling everything probabilistically, we build up the joint distribution of (u, a, y) by 
specifying a prior distribution jiQ x vq on (u, a) and an independent noise model on rj. 
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We are then interested in the posterior fi on («, a) given y. Denote | • | the Enclidean 
norm on and for any positive definite A G denote | • |^ := \A~^I‘^ ■ | the weighted 
norm on E'^. Under certain conditions, using a form of Bayes’ theorem, we may write 
y in the form 

fi{du,da) oc exp i^]^\Q{u,a) - |/|r^ fio{du)uo{da). 

The modes of the posterior distribution, termed MAP (maximum a posteriori) 
estimators, can be considered ‘best guesses’ for the state {u, a) given the data y. We now 
state rigorously what we mean by a MAP estimator for y, as in [^. Given {u, a) G A x A, 
denote by B^(u, a) the ball of radius S centred at (u, a). 


Definition 1.1 (MAP estimator). For each 5 > 0, define 
{u^,a^)= argmsbX y{B^{u, a)). 

{u,a)GX X A 

Any point (fi, a) G A x A satisfying 

y{B\u,d)) ^ 

Uo y{B^{u\a^)) 

is called a MAP estimator for the measure y. 


If this definition is applied to probability measures defined via a Lebesgue density, 
MAP estimators coincide with maxima of this density. Here we extend the notion to 
the study of piecewise continuous fields. 


1.3. Our Contribution 


The primary contributions of the paper are fourfold: 


(i) We develop the MAP estimator theory for infinite dimensional geometric inverse 
problems involving discontinuous fields, building on theory in both of the recent 
papers (9,14, and opening up new avenues for the study of MAP estimators in 
infinite dimensional inverse problems. 


(ii) We explicitly link MAP estimation for these geometric inverse problems to a 
variational Onsager-Machlup minimization problem. 

(hi) We show that the theory applies to groundwater flow model as in (^ and we show 
that the theory applies to the EIT problem as in [IT] . 

(iv) We implement numerical experiments for the groundwater flow model and 
demonstrate the feasibility of computing (local) MAP estimators within the 
geometric formulation, but also show that they can lead to multiple nearby 
solutions. We relate these multiple MAP estimators to the behaviour of output 
from MCMC to probe the posterior. 
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l.f- Structure of the Paper 

• In section we describe the forward maps associated with the groundwater flow 
and EIT problems, and show that they have the appropriate regularity needed in 
sections SHSl 

• In section]^ we describe the choice of, and assumptions upon, the prior distribution 
whose samples comprise piecewise Gaussian random flelds with random interfaces. 

• In section we show existence and uniqueness of the posterior distribution. 

• In section we deflne MAP estimators and prove their equivalence to minimisers 
of an appropriate Onsager-Machlup functional. 

• In section we present numerics for the groundwater flow problem. We consider 
three different prior models and investigate maximisers of the posterior distribution. 

• In section we conclude and outline possible future work in the area. 


2. The Forward Problem 


We consider two model problems. Our first problem (groundwater flow) is that 
of determining the piecewise continuous permeability of a medium, given noisy 
measurements of water pressure (or hydraulic head) within it. The second problem 
(EIT) is determination of the piecewise continuous conductivity within a body from 
boundary voltage measurements. 

In what follows, the finite dimensional space A will be a space of geometric 
parameters defining the interfaces between different media, and X will be a product 
of function spaces defining the values of the permeabilities/conductivities between the 
interfaces. 


We begin in subsection 2.1 by defining the construction map (u,a) for the 

piecewise continuous fields. In subsections |2.2| and |2.3| we describe the models for 
groundwater flow and EIT respectively, and prove regularity properties of the resulting 
forward maps; these properties are required for our subsequent theory. 


2.1. Defining the Interfaces 

Let D C be the domain of interest and let A C be the space of geometric 
parameters. Take a collection of set-valued maps Aj : A ^ i = 1,... ,N such 

that for each a G A we have 

N 

[J Ai{a) = D, Ai{a) n Aj{a) = 0 liiyi j. 

i=l 

We assume that each map Aj is continuous in the sense that 
|o — 6| —>• 0 |Aj(o)AAj(6)| —)• 0 

where A denotes the symmetric difference: 

AAB := (A\5) U (5\ A). 
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Let X = C^{D; M.^). Given u = (ui,..., G X and a G A we define the function 
G L°°{D) by 

N 

= F(u, a) := nilAi(a)- (2.1) 

1=1 

where F : X x A ^ L'^{D) is the construction map. 

We give four examples of the functions Ai and the sets/interfaces they define. 

Example 2.1. Let D = [0,1]^, A = [0,1]^ and N = 2. We specify points a and b on 
either side of the square D and join them with a straight line. We then let Ai(a, h) he 
the region of D below this line and ^2(0,6) = D \ Ai{a,b). 



Figure 2: Possible sets Ai corresponding to Example 


2.1 


Example 2.2. Let D = [0,1]^, A = [0,1]^ and N — 2. Choose a continuous map 
H : A ^ L°°([0,1]) such that H{a,b){0) = a and H{a,b){l) = b for all {a,b) G A. Let 
Ai(a,b) be the region of D beneath the graph of the curve H(a,b) and let A 2 {a,b) = 
D\Ai{a,b). This setup includes the previous example: H{a,b){x) — a+{b — a)x defines 
the appropriate straight lines. 

The continuity of Ai and A 2 can be seen by noting that 

|Ai(oi, 6i)AAi(a2,62)1 = 1^2(01,6i)AA2(a2,62)! 

< [ \H{ai,bi){x) - H{a 2 ,b 2 ){x)\dx 
Jo 

< \\H{ai,bi) - H{a2,b2)\U 

and using the continuity of H into Z/°°([0,1]). 

For example, one may take H to be given by 

H{a, b){x) = a + {b — a)x + a:sin(67rx)/10 
which can he seen to be continuous into L°°([0,1]). 

Example 2.3. We can generalise the previous example to allow the inclusion of a fault. 
Let D = [0,1]^? A = [0,1]^ X [“1)1] N = 2. Let p G (0,1) denote the horizontal 

location of the fault. Civen H : [0,1]^ L°°([0,1]) as in the previous example, define 
H :A^ A°°([0,1]) by 


H{a, b, c){x) = 


H{a, h){x) 


X G [0,p] 


c + H{a,b){x) X G (p, 1] 
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Figure 3: Possible sets Ai, corresponding to Example 


2.2 


so that the parameter c determines the (signed) magnitude of the fault. Defining the 
sets Ai(a,b,c) and A 2 {a,b,c) as the regions of D beneath and above the curve H(a,b,c) 
respectively, the continuity can be seen in a similar manner to the previous example. 



Figure 4: Possible sets Ai, corresponding to Example 2.3 in the case p = 1/2. 


Example 2.4. Again working with D — [0,1]^, but with a much larger parameter space, 
one could also select points at specific x-coordinates and linearly interpolate between 
them. Fix K,N eN and set A = C [0, where S^v-i is the simplex 

f-i = {(?/!, ••• ,1/Ar-i) e [0,1]^“^ I 0 < 1/1 < ... < Pn-i < 1}. 


^N- 

Then given a E A, define the functions ffia), i — 1,..., N — 1, to be the linear 
interpolation of the points Afia), i = 1,..., N — 1, is then defined to be the 

region between the graphs of the functions ffia) and fi-i{a), andAj^{a) = 

In order to see the continuity of these maps, we first partition the domain into strips 
Dj, 

Dj = ^{x,y) E D 
so that we have 


j - 1 
K-1 


<x< 


K-1 


j = l,...,K-l 


K-1 

A(o) = [J A{a) n Dj. 

It follows from properties of the symmetrie differenee that 

K-1 

\Afia)AAfib)\ < J] |(4(a) n DfiA{Afih) n Dfi\. 
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It hence suffices to show that the maps Ai{-)r]Dj are continuous for all i,j. This follows 
from the same argument as in Example\2.^ for sufficiently small \a — h\. 



Figure 5: Possible sets Ai, corresponding to Example 2.4 in the case iF = 11, = 6 


2.2. The Darcy Model for Groundwater Elow 


We consider the Darcy model for groundwater flow on a domain D C d = 1,2,3. 
Let K = {Kij) denote the permeability tensor of the medium, p the pressure of the water, 
and assume the viscosity of the water is constant. Darcy’s law tells us that the 
velocity is proportional to the gradient of the pressure: 


V — —nVp. 

Additionally, a local form of mass conservation tells us that 

V • u = /. 


Combining these two equations, and imposing Dirichlet boundary conditions for 
simplicity, results in the PDE 

—V • (kVp) = / in D 

p = g on dD. 

This is the PDE we will consider in the forward model, and it gives rise to a solution 
map K p. 

For simplicity we will work in the case where k is an isotropic (scalar) permeability, 
bounded above and below by positive constants, and so it can be represented as 
the image of some bounded function under a positive continuously differentiable map 
cr : M M+. 

Let V = the Sobolev space of once weakly differentiable functions on D 

Then given / G g G u & X and o G A, define Pu^a G E to be the 

solution of the weak form of the PDE 


13 


-V • (cr(u“)Vp„,a) = / in D 

Pu,a =9 on dD. 

We are first interested in the regularity of the map IZ 


( 2 . 2 ) 

A X A ^ E given by 


IZ{u,a) = Pu^a- We first recall what it means for Pu^a fo be a solution of (2.2). Since 
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g G by the trace theorem 13 there exists G E V such that tr(G) = g. The 

solution pu,a of is then given by Pu,a = qu,a + G, where qu,a G -f^o(-D) solves the 
PDE 

' -V • (a(u“)= / + V • ((t(u“)VG) in D 

qu,a =0 on dD. 

The following lemma tells us that the map IZ is well defined and has certain regularity 
properties. Its proof is given in the appendix. 


(2.3) 


Lemma 2.5. The map IZ : X x A ^ V is well-defined and satisfies: 

(i) for each {u,a) G X x A, 

\\IZ{u,a)\\v < (||/||v* + ||o'(m“)||l°° ||G||i/)/Kmm(^f, tt) + ||G||y 
where K,jnm{u,a) is given by 

K-mm{u,a) = essinf (t(u“(x)) > 0; 

xeD 

(a) for each a E A, TZ{-,a) : X ^ V is locally Lipschitz continuous, i.e. for every 
r > 0 there exists L{r) > 0 such that, for all u,v E X with ||u||x, ll'^'llv < o,nd 
all a E A, we have 

\\TZ{u, a) - TZ{v, a)||y < L{r)\\u - u||x; 

(Hi) for each u E X, IZ{u, •) : A —?• E is continuous. 

We now choose a continuous linear observation operator i : V ^ . For example, 

writing i = (G,, ij), we could take 

^iiP)=f dx, z = l,...,J (2.4) 

Jr, 

for some £ > 0, so that G approximates a point observation at the point Xi E D. Our 
forward operator ^ : X x A ^ is then defined by ^ £ o 7^, so that it can be written 

as the composition 

{u, a) ^ ^ K — a{u°‘) p £{p) 

From the above regularity of TZ we can deduce the following regularity properties 
of our forward operator Q: 

Proposition 2.6. Define the map Q : X x A ^ as above. Then Q satisfies 

(i) For each r > 0 and u,v E X with ||u||js:, there exists C{r) > 0 such that 

for all a E A, 

\Q{u,a) - G{v,a)\ < C{r)\\u - v\\x. 

(a) For each u E X, the map G{u, ■) : A ^ is continuous. 


Proof, (i) The map i is defined to be a continuous linear functional, and so in particular 
is Lipschitz. Since we have G = i oIZ the result follows from Lemma [2.5[ ii). 

(ii) This follows from the continuity of i and Lemma 2.5[ iii) . 

□ 
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Figure 6: An example domain D, with attached electrodes for the FIT problem. 


2.3. The Complete Electrode Model for EIT 


Electrical Impedance Tomography (EIT) is an imaging technique that aims to make 
inference about the internal conductivity of a body from surface voltage measurements. 
Electrodes are attached to the surface of the body, current is injected, and the resulting 
voltages on the electrodes are measured. Applications include both medical imaging, 
where the aim is to non-invasively detect internal abnormalities within a human patient, 
and subsurface imaging, where material properties of the subsurface are differentiated 
via their conductivities. Early references include (l^ in the context of medical imaging 


and 20 in the context of subsurface imaging. 


The complete electrode model (CEM) is proposed for the forward model in 32 


and shown to agree with experimental data up to measurement precision. In its strong 
form, the PDE reads 

—V • [k{x)Vv{x)) = 0 X E D 


'ei 


A Q T 

K— dS = Ii 

on 

.dv 

v{x) + Zin{x)^{x) 


1 = 1,...,L 


X EdD\ ljf=i ei 


= Vi X Eei,l = l, 


(2.5) 


The domain D represents the body, and (ei)fLi C dD the electrodes attached to its 
surface with corresponding contact impedances A current f is injected into each 

electrode e/, and a voltage measurement V/ made. Here n represents the conductivity 
of the body, and v the potential within it. Note that the solution comprises both a 
function v G H^{D) and a vector G of boundary voltage measurements. 

A corresponding weak form exists, and is shown to have a unique solution (up to 
constants) given appropriate conditions on k, (zi)iLi and {Ii)jLi - see 32 for details. 
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Moreover, under some additional assumptions, the mapping k, i-)- is known to 

be Frechet differentiable when we equip the conductivity space with the supremum 
norm [It] . 

We can apply different cnrrent stimnlation patterns to the electrodes to yield 
additional information. Assume that we have M different (linearly independent) current 
stimulation patterns This yields M different mappings n i-)- each 

with the regularity above, or equivalently a mapping k V where V G with 
J = LM. 

Analogously to the Darcy model case, we will consider isotropic conductivities of 
the form k = cr(M“), where a : E ^ E"*" is positive and continuously differentiable. Our 
forward operator Q : X x A ^ E'^, is then given by the composition 

(u,a) ^ ^ K = (j(u“) ^ ^ 

We show in the appendix that the map defined in this way has the same regularity as 
the map corresponding to the Darcy model. 

Proposition 2.7. Define the map Q : X x A as above. Then Q satisfies 

(i) For each r > 0 and u,v E X with ||m||x, II'^'IIx < there exists C{r) > 0 such that 
for all a E A, 

\Q{u,a) - G{v,a)\ < C{r)\\u - v\\x. 

(a) For each u E X, the map Q{u, ■) : A ^ is continuous. 

3. Onsager-Machlup Functionals and Prior Modelling 

In this section we recall the definition of an Onsager-Machlnp functional for a measure 
which is equivalent[|] to a Gaussian measure. We then introduce the prior measures that 
we will consider, first on the fnnction space X, then the geometric parameter space 
A, and finally the product space X x A. We conclude the section by extending the 
definition of Onsager-Machlup functional so that it is appropriate for the measures we 
consider here, supported on fields and geometric parameters which are combined to 
make piecewise continuous functions. 

3.1. Onsager-Machlup Functionals 

The Onsager-Machlup functional of a measure is the negative logarithm of its Lebesgue 
density when snch a density exists, and otherwise can be thonght of analogously. We 
start by defining it precisely for measures defined via density with respect to a Ganssian, 
allowing for infinite dimensional spaces on which Lebesgue measure is not defined. 
Suppose that p is a measure equivalent to a Ganssian measure /iq. Then the Onsager- 
Machlup functional for fi is defined as follows. 

f Two measures p,ij. on a, measurable space (M, At) are equivalent if v{A) = 0 if and only if /u(A) = 0, 
for A e At. 
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Definition 3.1 (Onsager-Machlup functional I). Let fi be a measure on a Banach space 
Z which is equivalent to jiQ, where fiQ is a Gaussian measure on Z with Cameron-Martin 
space E. Let B^{z) denote the ball of radius 5 centred at z & Z. A functional / : Z —)■ M 
is called the Onsager-Machlup functional for fi if for each x,y & E, 


limh^ 

«0 mbHv)) 

and I(x) = oo for x ^ E. 


exp {I{y) - I{x)) 


Remarks 3.2. (i) The Onsager-Machlup functional is only defined up to addition of 

a constant. 


(ii) If Z is finite dimensional and pi admits a positive Lebesgue density p, then 
I{x) = —\ogp{x) for all x £ Z. In light of the previous remark, this is true 
even if p is not normalised. 

(Hi) Let Z = ML be finite dimensional, and let Pq = N{0, S) be a Gaussian measure on 
Z. Let r e be a positive-definite matrix, A e and y e M"*. Define p 

by 

SO that 

^{x) oc exp - y\l - ]^\x\l 

Then by the previous remark, the Onsager-Machlup functional for p is given by 

H^) = - y\r + 

for all X E Z, which is a Tikhonov regularised least squares functional. 



(iv) The preceding example (Hi) may be extended to an infinite dimensional setting. Let 
Z be a separable Banach space, and let Pq = N(0,Co) be a Gaussian measure on Z 
with Cameron-Martin space {E, (•, ■)e, || • Hb). Let F e be a positive-definite 

matrix, A : X —?• E"* a bounded linear operator and y G E"*. Define p by 

Eix)cexp(-\\Ax-y\l 

Then Theorem 3.2 in 0 tells us that the Onsager-Machlup functional for p is 
given by 

I{x) = ^l^a;-|/|^ + ^||a:|||. 



(v) In this paper, the posterior distribution will be a measure on the product space 
Z = X X A. The prior distribution will be an independent product of a Gaussian 
on X and a compactly supported measure on A. Due to the assumption of compact 
support, the prior will not be equivalent to a Gaussian measure on Z and so the 
above definition doesn’t apply; we provide a suitable extension to the definition in 


subsection 3.4 
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As we are taking a Bayesian approach to the inverse problem, we incorporate 
our prior beliefs about the permeability/conductivity into the model via probability 
measures on X and A. We will combine these into a prior measure on the product space 
A X A. We equip this space with any (complete) norm IK-, •)|| such that if ||(«, a)|| 0, 

then ||m||js: ^ 0 and |a| ^ 0. 


3.2. Priors for the Fields 


We wish to put priors on the fields Ui,... ,uj^ G C^{D). We use independent Gaussian 
measures Ui ~ /Tq := N{mi,Ci), where the means rrii G C^{D), and each covariance 
operator G : C^{D) C^{D) is trace-class and positive definite. It follows that the 

vector (ui, ..., uj^) ~ /rj x ... x [Iq =: fiQ is Gaussian on X: 


N 


tio — N i Ci 


i=l 


where m = {mi,..., mpf) G X. If Ei denotes the Cameron-Martin space 10 of /jf, then 
that of /To is given by 

N 


i=l 


with inner product given by the sum of those of its component spaces. 
The Onsager-Machlup functional of fiQ is known to be given by 


J{u) = 


|||u — m 


II; u — m E E 


oo u — m ^ E. 

This can be seen, for example, as a consequence of Proposition 18.3 in p^. 


Remark 3.3. We may assume that the different fields are correlated under the prior, 
so long as /tq remains Gaussian on X - this does not affect any of the following theory. 
Allowing correlations between the fields and the geometric parameters under the prior is 
a more technical issue however, and so we will assume that these are independent. 


Example 3.4. Define the negative Laplacian with Neumann boundary conditions as 
follows: 


A = -A, V(A) = <^ u G H^(D) 


= 0 on dD, [ 
di^ Jd 


u{x) dx = 0 


Then A is invertible. We can define G = A“"b where each ai > d/2. Then each 
Ci is trace-class and positive definite, and samples from each pf will be almost surely 
continuous and so /tq can be considered as a Gaussian measure on X. Moreover, 
regularity of the samples will increase as ai increases, see fWj for details. 
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We also want to put a prior measure on the geometric parameters, i.e. we want to 
choose a probability measure on A. Since A C the analysis is more straightforward 
than the infinite dimensional case. Let i/ be a probability measure on A with compact 
support S' C A. We assume n is absolutely continuous with respect to the Lebesgue 
measure and that its density p is continuous on S. Despite being defined on a finite 
dimensional space, the measure u is not necessarily equivalent to the Lebesgue measure 
on the whole of and so the previous definition of Onsager-Machlup functional does 
not apply. We hence must formulate a new definition for this case. 

Since p > 0 on int(5'), we can use the continuity of p to calculate the limits of ratios 
of small ball probabilities for v on int(S'). Let ai, 02 G int(S'), then 

HO SsH„^f>(a)Aa 

do 

^ P(ai) 

P(a2) 

= exp (logp(ai) - logp(a 2 )). 

If either Ui or 02 lie outside of S the limit can be seen to be 0 or 00 respectively. It 
hence makes sense to define the Onsager-Machlup functional for on A \ dS as 

f-logp(a) aGint(^) 

K{a) = < 

1^00 a ^ S. 

For a G dS, we define K{a) to be the limit of K from the interior: 

K{a) = — lim logp(6) a E dS 

b^a 

bGint(S) 

which is well defined due to the continuity of p on int(S'). K is then continuous on the 
whole of S. 

Remark 3.5. If we were to define K on dS in the same way that we defined it on A\dS, 
K would have a positive jump at the boundary related to the geometry of S. This would 
mean that K was not lower semi-continuous on S which would cause problems when 
seeking minimisers. The definition we have chosen is appropriate: if any minimising 
sequence (a„)n>i Q int(S') of K has an accumulation point on dS, then v has a mode 
at that point. 

If we have no prior knowledge about the interfaces and A is compact, we could 
place a uniform prior on the whole of A. Otherwise we could either choose a prior with 
smaller support, or one that weights certain areas more than others. 
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We assume that the priors on the fields and the geometric parameters are independent, 
so that we may take the product measure po x as our prior on X x A. Note that 
if F : X X A —7- L°°(D) denotes the construction map (u,a) defined earlier by 

( |2.1| ), then our prior permeability/conductivity distribution on L°°[D) is given by the 
pushforwarcjl] Po = x uq). This is much more cumbersome to deal with however, 

since for example L°°(D) is not separable. It is for this reason we incorporate the 
mapping F into the forward map Q. Assuming now that the prior /tq x vq is as described 
above, we can define the Onsager-Machlup functional for measures fi on X x A which 
are equivalent to po X no. 

Definition 3.6 (Onsager-Machlup functional II). Let jjL he a measure on X x A 
equivalent to fio x Pq, where fio and Pq satisfy the assumptions detailed above. Let B^{u, a) 
denote the ball of radius S centred at {u, a) G X x A. A functional I : X x A —>■ M 
called the Onsager-Machlup functional for fi if, 

(i) for each {u, a), {v, b) E E x int(S'), 

(a) for each {u,a) E E x dS, 

I{u,a) = lim I{u,b); 

b^a 

bGmt(S) 

(Hi) I{u, o) = oo for u ^ E or a ^ S. 

4. Likelihood and Posterior Distribution 

We return to the abstract setting mentioned in the introduction. Let X be a separable 
Banach space, A C and F = Suppose we have a forward operator Q : XxA ^Y. 
If (u, a) denotes the true input to our forward problem, we observe data y EY given by 

y = Q{u,a)Yr} 

where ~ Qo := A’(0,r), T G positive definite, is Gaussian noise on Y 

independent of the prior. 

It is clear that we have y\{u, a) ~ := N(Q{u, a), T). We can use this to formally 

find the distribution of {u,a)\y. First note that 

Qu,ai^y) = exp ^-4>(M,a;?/) -F ^\y\l^ Qoidy) 

where the potential (or negative log-likelihood) ^ : X x A x Y ^Mis given by 

^{u,a;y) = ^\g{u,a) - y\l. (4.1) 

§ Given a measurable map F : (X, X) ^ {YX) between two measurable spaces, the pushforward of a 
measure /i on X is the measure on Y defined by = /j.{F~^{A)) for A G y. If a random 

variable u on X has law /i, then the random variable F{u) on Y has law 
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Hence under suitable regularity conditions, Bayes’ theorem tells us that the distribution 
/i of (u,a)\y satisfies 


fi(du,da) oc exp ( — ^{u, a; y)^ fj,Q(du)uQ{da) 

after absorbing the exp (^|?/|r) term into the normalisation constant. 

We now make this statement rigorous. To keep the situation general, we do not 
insist that d> takes the form (4.1), and instead assert only that d> satisfies the following 
assumptions. 


Assumptions 4.1. There exists X' x A' Q X x A such that 
(i) for every £ > 0 there is an Mi{e) G M such that for all u E X' and all a E A' 

a; y) > Mi{e) - e\\u\\x-, 

(a) for each u E X' and y eY , the potential 'h(u, •; y) : A' ^ M is continuous; 

(Hi) there exists a strictly positive M 2 : M"*" x M+ x M+ M+ monotonic non-decreasing 
separately in each argument, such that for each r > 0, u E X' and a E A', and 
yi,y 2 E Y with \yi\, I 2 / 2 I < r, 

\^{u,a;yi) - <h(u,o; 7 / 2 )! < M 2 (r, ||u||x, \a\)\yi - ?/ 2 |; 

(iv) there exists a strictly positive M3 : M'*' x A x T — >■ M"*", continuous in its second 
component, such that for each r > 0, a E A' and y E Y, and ui,U 2 E X' with 
||mi||x, IIW 2 IU < r, 

\^{ui,a-,y) - ^{u 2 ,a-,y)\ < M^{r,a,y)\\ui - U 2 \\x- 


These assumptions are used in the proof of existence and well-posedness of the 
posterior distribution, which is given in the appendix: 


Theorem 4.2 (Existence and well-posedness). Let Assumptions f.l hold. Assume that 
(/io X Uq){X' X A') = 1, and that {/xq x uo){{X' x A') fl 5) > 0 for some bounded set 
B E X X A. Then 

(i) ^ is piQ X uq X QQ-measurable; 

(a) for each y eY , Z{y) given by 


z(y) = / ' 

JxxA 


exp(-$(u, a; y)) iXo{du)uQ{da) 


is positive and finite, and so the probability measure pp, 

lP{du,da) = —^exp(-$(u, a;?/))/ro(du)z/o(da) 

z{y) 

is well-defined. 

(Hi) Assume additionally that, for every fixed r > 0, there exists £ > 0 with 
exp(£||u|||)(l + M 2 (r, ||«||x,|o|)^) G L^^^^^iX x A;M). 

Then there is C{r) >0 such that for all y,y' eY with \y\, \y'\ < r, 
dHeii(A^, < C\y - y'\. 


(4.2) 
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Remark 4.3. In this paper we are focused on the case when the field prior fiQ is taken 
to be Gaussian. However, the above existence and well-posedness result still holds if, for 
example, fiQ is taken to be Besov rather than Gaussian, since a Fernique-type theorem 


holds for such priors '10, 2, 


We show that for both choices of test models, the potential (4.1) satisfies 
Assumptions |4.1 : 

Proposition 4.4. Let X = C^{D; and let Q : X x A ^ Y denote the forward map 
corresponding to either the groundwater flow or EIT problem, as detailed in section 
Let y & Y and let F e be positive definite. Define the potential 4>:AxAxF 

by 

^{u,a;y) = ^\0iu,a) -y\l. 

Then satisfies Assumptions |A with A' x A' = A x A. 

Proof. (i) 4> > 0 so this is true with Mi = 0. 


(ii) Fix n G A' and y EY. Propositions 2.6 and 2.7 tell us that Giu,-) is continuous for 
either choice of test model. The map z ^ \z — y|p is continuous, and so 4>(u, •; y) 
is continuous too. 


iii) A consequence of Propositions 2.6 and 2.7 is that for each u E X and a E A, 


G{u, a) can be bounded in terms of ||m||x and |a|. The result then follows from the 
local Lipschitz property of the map y \yf‘. 


(iv) Propositions 2.6 and 2.7 tell us that Gf,a) is locally Lipschitz for either choice of 
test model. The map z \z — r/|p is locally Lipschitz, and hence we conclude that 
4>(-, a; y) is locally Lipschitz, with Lipschitz constant independent of a. 

□ 

With a choice of prior as described in section we can therefore apply Theorem 
4.2| in the cases where the forward map is one of the two described in section and the 
observational noise is Gaussian. In this case, the constant M 2 (r, ||u||x, |q-|) appearing in 
Assumptions 4.1[ iii) is independent of ||u||x and |o|, and so the integrability condition 
(iii) in Theorem |4.2| always holds via Fernique’s theorem. The condition on positivity 
of a bounded set can be seen by taking, for example, B = 5^(0) x S, where S is the 
(compact) support of Uq. 


5. MAP Estimators 

In subsection |5.1| we characterise the MAP estimators for the posterior y in terms of the 


Onsager-Machlup functional for y. In subsection 5^ we relate this Onsager-Machlup 
functional to the Fomin derivative of y, with reference to the work (T^. 
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Throughout this section we assume that /i is given by (4.2). Furthermore we assume 
that fiQ has mean zero for simplicity. Additionally, when we assume that Assumptions 
|4.1| hold, we will assume that X' x A' = X x A. 

Suppressing the dependence of $ on the data y since it is not relevant in the sequel, 
we dehne the functional I : X x A ^ M by 


/(m, a) = $(«, a) + J(m) + A'(a) (5-1) 

where J, K are as dehned in subsections |3.2[ |3.3| respectively. In this section we attain 
the following three results concerning / and /r, which are proved in the appendix. 


Theorem 5.1. Let Assumptions f.l hold. Then the function I defined by (5.1) is the 
Onsager-Machlup functional for pi, where the Onsager-Machlup functional is as defined 
in Definition\3.^ 


Theorem 5.2. Let Assumptions 4-1 hold. Then there exists {u,a) E E x S such that 
I{u, d) = inf{/(u, a) ju E E, a E A}. 


Furthermore, if {un,an)n>i is a minimising sequence satisfying I{un,an) —?• I{u,a), then 
there is a subsequence {un^,an^)k>i converging to {u,d) (strongly) in E x S. 

Theorem 5.3. Let Assumptions \4 1\ hold. Assume also that there exists an M E M. 
such that d>(u, a) > M for any [u, a) E X x A. 

(i) Let — argmax/r(i?'^(u, o)). There is a {u, a) E E x S and a subsequence 

{u,a)eX X A 

of {u^, a^)5>o which converges to [u, a) strongly in X x A. 

(a) The limit {u,d) is a MAP estimator and minimiser of I. 


A consequence of Theorem is that, under its assumptions, MAP estimators 
and minimisers of the Onsager-Machlup functional are equivalent. The proof of this 
corollary is identical to that of Corollary 3.10 in 


Corollary 5.4. Under the conditions of Theorem 5.3 we have the following. 


(i) Any MAP estimator minimises the Onsager-Machlup functional I 

(a) Any {u*,a*) E ExS which minimises the Onsager-Machlup functional I is a MAP 
estimator for the measure pi given by (4-2). 


5.2. The Fomin Derivative Approach 

In recent work of Helin and Burger , the concept of MAP estimators was generalised 
to weak MAP (wMAP) estimators using the notion of Fomin differentiability of 
measures. The definition of wMAP estimators is such that if is a MAP estimator 
then it is a wMAP estimator, but not necessarily vice versa. Under certain assumptions, 
they show that wMAP estimators are equivalent to minimisers of a particular functional. 
The assumptions do not hold in our case, since our forward map is non-linear and our 
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prior /To X pq isn’t necessarily convex, however the fnnctional agrees with our objective 
functional /. Thus in what follows we provide a link between the Fomin derivative of 
the posterior fi and our objective functional I. 

The Fomin derivative of a measure on a Banach space X equipped with its Borel 
cr-algebra B{X) is defined as follows. 


Definition 5.5. A measure X on X is called Fomin differentiable along the vector z e X 
if, for every set A £ B{X), there exists a finite limit 


dA{A) 


lim ~ 

t —>-0 if 


The Radon-Nikodym density of d^X with respect to X is denoted fif, and is called the 
logarithmic derivative of X along z. 


Example 5.6. (i) Let uq be a measure on with Lebesgue density p, supported and 

continuously differentiable on S' C Then for any a G int(S) and b £ we 
have 

. b = dblogp{a). 

p[a) 

(a) Let pq be a Gaussian measure on a Banach space X with Cameron-MaHin space 
{E, (•, ^e)- Then for any u £ X and h £ E we have 


/dh°i^) = d)E. 


This follows from the Cameron-Martin and dominated convergence theorems. 

(in) Again using the Cameron-Martin and dominated convergence theorems, we see that 
with no and po as above, for any {u,a) £ X x int(S) and {h,b) £ E x 


We can use the above example to characterise the Fomin derivative of our posterior 
distribution p, given by (4.2). 


Theorem 5.7. Assume that $ : X x A —?• E is bounded measurable with uniformly 
bounded derivative, and assume that p is continuously differentiable on S. Then 
for each {u,a) £ X x int(S) and {h,b) £ E x E^, we have 


P(h,b)(^w) = -d^h,b)^{u,a) - {u,h)E + dblogp{a) 

= -d{^h,b)I{u,a) 

Therefore, {u, a) is a critical point of I if and only if (n, a) = 0 for all 
{h,b)£ExRC 


Proof. We use result (2.1.13) from [^, which tells us that if A is a measure 
differentiable along .2 and / is a bounded measurable function with uniformly 
bounded partial derivative d^f, then the measure / • A is differentiable along h as 
well and 


dfif-X) = dJ-X + f-d,X. 
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We apply this result with X = /iq x uq, f = exp(—$)/Z and = {u,a). Note that 
/ satisfies the assumptions of (2.1.13) due to the assumptions on d>. The result 
then follows using Example |5.6| (hi) above. □ 


6. Numerical Experiments 


In this section we perform some numerical experiments related to the theory above 
for a variety of geometric models, in the case of the groundwater flow forward map 
introduced in subsection 2.2, We both compute minimisers of the relevant Onsager- 


Machlup functional (i.e. MAP estimators), and we sample the posterior distribution 
using a state-of-the-art function space Metropolis-Hastings MCMC method. We then 
relate the samples to the MAP estimators. From these numerical experiments we observe 
the following behaviour of the posterior distribution. 


(i) The posterior distribution can be highly multi-modal, especially when the 
parameterised geometry is non-trivial. This is evident from the sensitivity of the 
minimisation of the objective functional on its initial state, and the behaviour of 
MCMC chains initiallised at these calculated minimisers. 

(ii) When the geometry is incorrect the fields attempt to compensate, which 
presumably contributes to the existence of multiple local minimisers of the objective 
functional; this occurs in both the MAP estimation and the MCMC samples. A 
consequence is that many of the local minimisers lack the desired sharp interfaces. 
These minimisers could however be used to suggest more appropriate geometric 
parameters for the initialisation. 

(hi) The mixing rates of MCMC chains have a strong dependence upon which local 
minimiser they are initialised at: acceptance rates can vary wildly when the initial 
state is changed but all other parameters are kept fixed. This provides some insight 
into the shape of the posterior distribution. 

(iv) Though often there are many local minimisers, they can be separated into classes of 
minimisers sharing similar characteristics, such as close geometry. MCMC chains 
typically tend to stay within these classes, which can be observed by monitoring 
the closest local minimiser to an MCMC chain’s state at each step. This suggests 
that the posterior can possess several clusters of nearby modes. 


One conclusion we can draw from the above points is that there are often many different 
geometries that are consistent with the data. This is not necessarily an effect of noise 
on the measurements, and the effect may persist as the noise level goes to zero, since it 
is unknown if these geometric parameters are uniquely identifiable in general. 


6.1. Test Models 

We consider three different geometric models: a two parameter, two layer model; a five 
parameter, three layer model with fault; and a five parameter channelised model. 
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Figure 7: The definition of the geometric parameters a = (a\ a^) in Model 1. 


In what follows, as in Example |3.41 we define the negative Laplacian with Neumann 
boundary conditions: 


A = -A, V(A) = e H^(E>) 

Recall that if u ~ ^(0,^4““) with a > d/2, then u is almost surely continuous [Id|. 


^ = 0 on dD, J u{x) dx = 0 . 


6.1.1. Model 1 (Two layer) This model is described in Example 


2.1 


The geometric 


parameters a = (a\a^) are defined as in Eigurej^ Eor simulations, we use the choice 
of prior 


iyo=U{[0,l])xU{[0,l]). 


6.1.2. Model 2 (Three layer with fault) This model is described in [^, where it is 

labelled Test Model 1. The geometric parameters o = {a^, , a^) are defined as 

in Eigurej^ with the fault occurring at x = 0.55. Eor simulations, we use the choice of 
prior 

/io = A(2,27l-^-^) X N{0,A-^-^) X A(-2,2A-^-^), 

no = U{S) X U{S) X N([-0.3,0.3]), 
where S C [0,1]^ is the simplex S = {{x,y) | 0 < x < 1, x < y < 1}. 

6.1.3. Model 3 (Channel) This model is described in [^, where it is labelled Test 

Model 2. The geometric parameters a = [of, of, of, , a^) are defined as in Eigure 

Here a}of, of represent the channel amplitude, frequency, angle, initial point and 
width respectively. For simulations, we use the choice of prior 
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in Model 3. 


no = ?7([0,1]) X t/([7r,47r]) x 17([-7r/4,7r/4]) x t/([0,1]) x t/([0,0.4]). 

For each model, we fix a true permeability o^) as a draw from the corresponding 
prior distribution, generated on a mesh of 256^ points. For the forward model, we take 
the coefficient map cr(-) = exp(-). We observe the pressure on a grid of 25 


uniformly spaced points, via the maps (2.4) with £ = 0.05. We add i.i.d. Gaussian noise 


A^( 0 , 7 ^) to each observation, taking 7 = 0.01. The resulting relative errors on the data 
can be seen in Table Small relative errors of this size typically make the posterior 
distribution hard to sample as they lead to measure concentration phenomena; MAP 
estimation can thus be particularly important. 


6.2. MAP Estimation 


Based on the theory in section we can calculate MAP estimators by minimising the 
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Model Number 

Mean relative error (%) 

Range of relative errors (%) 

1 

0.5 

0.02 - 3.5 

2 

0.9 

0 

1 

0 

3 

0.3 

0.1 - 1.0 


Table 1: The relative error on the data, when each measurement is perturbed by an 
instance of N{0, 0.01^) noise. 


Onsager-Machlup functional for the posterior distribution. We compute local minimisers 
of the Onsager-Machlup functional using the following iterative alternating method. 


Algorithm 6.1. 1. Choose an initial state {uQ,ao) £ X x A. 

2. Update the geometric parameters simultaneously using the Nelder-Mead algorithm. 

3. Update each field individually using a line-search in the direction provided by the 
Gauss-Newton algorithm. 

4 . Go to 2. 


The Nelder-Mead and Gauss-Newton algorithms are discussed in 30 , in sections 


9.5 and 10.3 respectively. Since we do not update the fields and geometric parameters 
simultaneously, it is possible that this algorithm will get caught in a saddle point: 
consider for example the function / : M x M E, f{x,y) = xy, at the point (0,0), 
being minimised alternately in the coordinate directions. Hence once the algorithm 
stalls, we propose a large number of random simultaneous updates in an attempt to 
find a lower functional value. If this is successful, we return to step 2 of the algorithm. 
We terminate the algorithm once the difference between successive values of $ is below 
TOL = 10“^. Calculations are performed on a mesh of 64^ points in order to avoid an 


inverse crime. 

To ensure that we explore the support of the posterior distribution, we choose a 
variety of initial states {uQ,ao)£ X x A for the minimisation such that I{uQ,ao) < 00 
in the continuum setting. To this end, we let oq be a draw from the prior distribution 
Uq, and take % fo he in the Cameron-Martin space of jiQ. Specifically, if a component 
oiu £ X has prior distribution N{m, A~°‘), we take the corresponding component of uq 
to be a draw from N{m, Output of the algorithm is shown in Figures 

We first comment on the minimisers of the Onsager-Machlup functional for Model 1. 
Generally the geometric parameters are closely recovered regardless of the initialisation 
state, though there is more variation in the fields. In the simulations where the 
geometry is inaccurate, for example simulations 7, 17 and 46, the fields can be seen 
to be compensating by forming a ‘soft’ interface where the true interface is. 

The minimisers associated with Model 2 admit much more variation, though it 
is possible to partition them into smaller subsets of minimisers which share similar 
characteristics to one another, as mentioned in point (iv) at the beginning of the section. 


10 


12 
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The clustering of the different minimisers is performed by eye, classifying them according 
to similar geometric parameters. Additionally we have an Other class, containing the 
minimisers which do not appear similar to one another nor appear to fit into any other 
class. We see later with MCMC simulations that these states do still act as local 
maximisers of the posterior probability. 

The minimisers of the Onsager-Machlup functional for Model 3 show even more 
variation than those for Model 2, with the geometry in half of the minimisers not even 
being close to the true geometry. In the cases where the geometry is drastically wrong 
the fields have again attempted to compensate. This behaviour is particularly evident 
in the Other class, and is echoed in the MCMC simulations later. The Other class here 
is much larger than for Model 2, though as with Model 2 these states do appear to act 
as distinct local maximisers of the posterior probability. 

This multi-modality of the posterior distribution is not unexpected. The paper 
considers the history matching problem in reservoir simulation, in which inference is 
done jointly on both geometric and permeability parameters in the IC fault model. 
Though the forward map and observation maps are different in our model, we observe 
the same clustering of nearby local MAP estimators, and increased multi-modality as the 
dimension of the parameter space increases. In it is observed that the global minimum 
often does not correspond to the truth, especially in the presence of measurement 
noise, and so all local minimisers of the Onsager-Machlup functional should be sought 
before drawing conclusions about the permeabilities - this appears to be the case in 
our model as well. We note that MCMC can be useful in identifying a range of such 
minimi sers, in view of the links established in the next subsection between MCMC and 
MAP estimation. 

6.3. MCMC and Local Minimisers 

We now observe the behaviour of MCMC chains initialised at these local minimisers 
of the Onsager-Machlup functional. We use a Metropolis-within-Gibbs algorithm for 
the sampling, alternating between preconditioned Crank-Nicolson (pCN) updates for 
the fields, see for details, and Random Walk Metropolis updates for the geometric 
parameters. Again, simulations are performed on a mesh of 64^ points in order to avoid 
an inverse crime. 10^ samples are taken for each chain, with the initial 2 x 10^ discarded 
as burn-in. The conditional means calculated from the samples are shown in Figures 

[TBlfT^ 

We monitor the value that $ takes along the chain and compare it with 

the value d* takes on the local minimisers (umaP) ®map)- This is shown in Figures [l6p^ 
with the horizontal lines being the different values of d*(uMAP) ®map)- Note that it makes 
no sense to monitor the value that the objective functional I takes along the chain as 
the fields almost surely do not lie in the corresponding Cameron-Martin spaces, and so 
I is almost surely infinite along the chain in the continuum setting. 

In addition, we monitor which minimiser the chain is nearest at each step, in the 
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permeability space. Specifically, we look at 

m„ := argmin - r(MMAp,<iMAp)lli>(D) 


(61) 


where F : X x A —?• L°°{D) is the construction map (2.1) from the state space to the 
permeability space. We make the choice of the norm over the L°° norm for the 

permeability space to avoid over-penalising incorrect geometry. A selection of traces of 
rUn are shown in Figures These illustrate that even though some of the local 


minimisers are very far from from the true log-permeability, they do indeed act as local 
maximisers of the posterior probability. Moreover, they show the interaction between 
the different classes of minimisers in the cases of Models 2 and 3. Specifically, they 
show that the MCMC chains can easily move within these classes, but moving between 
classes is more difficult. 

We now discuss the above monitored quantities, and their relation to MAP 
estimators, on a model-by-model basis. Despite the slight variation in the fields of the 
minimi sers from Model 1, the conditional means arising from the MCMC are nearly all 
identical. Simulation 23 stands out from the rest due to its slightly incorrect geometry 
- this effect can be seen in the trace plot of $, Figure [T^ where the value of $ remains 
larger than the simulations started elsewhere. The traces of $ for all other initialisations 
behave similarly to one another, taking similar misfit values after 2 x 10^ samples. From 
Figure [T^ it can be seen that the MCMC chains considered all spend a lot of time close 
to MAP estimator 38, despite this not being the estimator with the lowest functional 
value. 

For Model 2, typically the conditional means within the different classes are very 
similar to one another. Classes A and C resemble each other, and Class B has 
compensated for incorrect geometry with the centre field. Faults have developed in Class 
D, though there is still some compensation in the field. The centre field and a small 
fault has appeared in Class E, but again the fields are compensating. The geometric 
parameters for the permeabilities in the Other class remain relatively unchanged, but the 
fields have more freedom to attain a lower misfit than in the Onsager-Machlup functional 
minimisation due to the lack of regularisation term. Figure El shows evidence for a 
number of local minima with a large data misfit value $, with some chains appearing to 
remain stuck in their vicinity. The four chains visible in Figure [T^ (top) correspond to 
chains 49, 47, 45 and 43, from highest to lowest $ value, all lying in the Other class - 
despite their significantly incorrect geometry, the corresponding MAP estimators appear 
to be genuine local maximisers of the posterior probability. 

In the channelised model, Model 3, there is yet more variation between local 
minimisers. Here the compensation effect by the fields is even more apparent in the 
conditional means, especially in the Other class. From Figure it appears that the 
local minima are much sharper and more sparsely distributed than the previous two 
models. Again the chains with the largest d> values were initialised at minimisers in the 
Other class, suggesting the existence of many posterior modes with incorrect geometry. 

The mixing of the MCMC chains varies heavily based on the initialisation points 
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of the chains: with the same jump parameters for the held and geometric parameter 
proposals, acceptance rates vary largely based on which minimiser the chain was started 
from. This indicates that some of the minima are much sharper than others. This is 


also evident from the traces of dehned above, Figures [T9pT| especially in Model 3. 
Note also from these hgures that the nearest local minimum typically lies in the same 
class as the initialisation state, though jumps between classes are possible. Though not 
shown, in Model 2, whenever the initial state lies in Class A, then the nearest minimiser 
always lies in Class A. 


7. Conclusions and Future Work 

We have made a new contribution to the recently developed theory of MAP estimation 
in infinite dimensions (9,14 . We link MAP estimation to a variational Onsager-Machlup 
functional. The work is focused on priors for piecewise Gaussian random fields, with 
random interfaces parameterised finite-dimensionally. Such fields arise naturally in 
applications such as groundwater flow and EIT, and these are used to illustrate the 
theory and numerics. The work opens up several new avenues for investigation. A 


major theoretical direction is to fully reconcile the approaches in (^ and 14 ; the work 


in this paper suggests that this may be possible. On the applications side an important 
new direction would be to consider problems in which the geometric parameters are 
no longer independent from the fields a priori. A possible extension could be to treat 
the geometric parameters as hyperparameters for the fields under the prior. This would 
allow, for example, the fields to have specific boundary conditions at the interfaces, which 
may be more physically appropriate in some contexts. A related hierarchical model was 
considered in j^, in which prior samples were piecewise white; this could be extended 
to allow for spatial correlations in the continuum setting. Computationally an exciting 
direction is to build upon definitions of MAP estimators to develop hybrid algorithms 
which fully exploit local minimiser structure of the Onsager-Machlup functional within 
MCMC. 


8. Appendix 

In this appendix we provide proofs of the results given in the paper. 

8.1. Results From Section\B 

Before we prove Lemma [2.5| we require the following lemma. 

Lemma 8.1. Let (X,iF,fj,) be a measure space and f G L^(X,iF,ff). Let Bn Q J- be a 
sequence of measurable subsets of X with ii{Bn) —>-0 as n ^ oo. Then 

/ f{x) /r(dx) —)■ 0 as n ^ oo. 
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Proof. Write fn{x) = /(a:)lB^(x). We have that /„ —)■ 0 in measure: for any h > 0, 

e X I \fn{x)\ > 5}) < iJ,({x e X I \fn{x)\ 7^ 0}) < ^ 0. 

Now suppose that ||/n||Li does not tend to zero. Then there exists h > 0 and a 
subsequence {fnk)k>i such that ||/nj.||Li > S for all k > 1. This subsequence still 
converges to zero in measure, and so admits a further subsequence that converges to 
zero almost surely. We can bound this subsequence above uniformly by /, and so an 
application of the dominated convergence theorem leads to a contradiction. The result 
follows. □ 


Proof of Lemma |^.5[ Showing that 77 is well-defined is equivalent to showing that PDE 
has a unique solution for all (n, o) G Xx A. Since G L°°{D) it is bounded, and so 
by the continuity and positivity of a there exist ^max > 0 with Kmin < cr(n“) < Kmax- 
The associated bilinear form is hence bounded and coercive. The right hand side can 
be seen to lie in H~^{D) since G G H^{D) and a{u^) < Kmax, and so a unique solution 
exists by Lax-Milgram. 



(i) In its weak form, the PDE ( |2.3[ ) is given by 

[ a{u^)Xqu,a • V(/P = f{ip) - [ aVG ■ Xip for all cpeV. 
J D ’ J D 

Taking ip = we deduce that 

^min IIII Ci I 0'(u fXq^^a ' Xq^^a 


ID 


= fiQu,a) - / Cr{u°')XG • Xqu,a 


ID 


^ ||/||y* lk«,a||y + ||cr(““)|U°°||VG||L2||Vg„,a||L2 


and so we have the estimate 


Ibu,a||y < ||7u,a||v + ||f?||v 

< (||/||y* + \\o'{u^)\\l^\\G\\v)/ K-min{u, a) + IlGlIv- 

(ii) Let u,v & X and o G A. Then satisfies the PDE 

I -V • {a{u‘^)X{Pu,a - Pv,a)) = V • ((cr(«“) - a{v^))Xpv,a) m D 
\ Pu,a-Pv,a =0 ondD. 

Setting K^{u, v, a) = a) A cl)i we see 


K^{u,V,a)\\X{Pu,a 


P^,a)||i2 < / (yW\X{Pu,a-Pv,a)? 
JD 


< 


[ (^K) 

JD 

||ct(«“) - 


- Cr(n“))V(p„,a -Pv,a) ' '^Pv,a 

Cr(n“) IIloo II a -Pt;,a) ||l 2 II Vp„,a||L2 


and so by (i). 


\\Pu,a - Pv,a\\v < \\Pv,a\\v\W{u°') - Cr{v'')\\L°°/K*iu,V,a) 
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< ||cr(«“) - cr(t;“)||ioo 

X (dl/lk* + ||cr(«“)||Loo||G'||y)/fi:*(«,a)^ + ||G'||y/K4«,a)). 
Using that the Ai are disjoint gives that 

CT - a ('^VilAi(a) 


\a(u°') - (T(t;“)|Uoo = 


i=l 




LOO 


= \\cr{uk) - cr{vk)\\L<^ 

for some k = k(a). Now suppose that ||n||x, lblU < Then the property of a 
yields 

\\a{uk) - cr{vk)\\L°° < max|cr'(t)| • \\uk - nfc||L°° < max|cr'(t)| • \\u - v\\x. 

\t\<r \t\<r 

Finally we deal with the terms: 

K^(u,v,a)~^ = ( essinf A | essinf 

[\ xeD J \ xeD 

/ 

< minaft) A mincr(t) 

y|t|<r \t\^P 

-3 


1 -3 


= min a(t) 

We bound the ||(T(n“)||ioo term similarly. Putting the above bounds together, we 
have 

||7^(M,a) - n{v,a)\\v = \\Pu,a - Pv,a\\v 

< max ('mm(T(t)^ fl|/||v* + 11^11^ + 1 

J=1.2 \|t|<r / \ 

X max W(t) \ ■ lln — v\\x 

\t\<r 

= L{r)\\u - v\\x. 

Note that the constant L{r) is uniform in a. 

(iii) We use a similar approach to the previous part. Given u E X and a,b E A, the 
difference Pu,a — Pu,b satishes 

-X • {a{u^)V{Pu,a - Pu,b)) = V • ((cr(n“) - cr(n^))Vp„,fc) in D 
Pu,a - Pu,b =0 ondD 

which leads to the bound 


l^^{u,a,b)\\X{pu,a-Pu,b)\\h < / cr(““)|V(p„,a 


ID 


= / (c7(n“) - a{u’’))X{Pu,a - Pu,b) ■ Xpu,b 
JD 

< \\X{pu,a-Pu,h)\\LA\{^W - <^iX^))Xpu,b\\L^ 
where k,^{u, a, b) = Umi-aiu, a) A K,^in{Ui b). It follows that 

\\Pu,a -Pu,b\\v < ||(c^(w“) - cr{u''))Xpu,b\\L^/K,\{u,a,b). 
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Again by the disjointness of the Aj and the property of a, 


(cr(n“) - cr{u’’))Vpu,b\\L^ = 


N 


N 


a 


N 








V i=^ 


, i=l 


L2 


{cr {uilAi{a)) - {uilAi(b))) 


i=l 

N 


L2 


L2 


- XI II («ilAi(a)) - cr («ilAi(6))) Vp„,6| 

i=l 

N 

< max \a\t)\ ■ |||«ilAi(a) - UilAi(b)\'^Pu,b\ 

|q<||hti||oo 

1=1 

N 


L2 


< V] max \a\t)\ ■ ||Mj||oo \\'^Ai(a)AAiib)'^Pu,b\ 

- |^|<lki||oo 


L2 


i=l 


since \1a — ls| = '^aab- Now as before we can bound : 


K^{u,v,a) = 


essinfe“NA ) /\ ( essinfe"'^^) 

x^D J y x^D 


1 -1 


(. 


< ( min a(t) A min a(t) 

Y|t|<max ||ni||oo l^l^max ||ni||oo 

-1 


-1 


< min a(t) 

V|t|<ll«llx 

Putting the above bounds together, we have 
\\n{u,a) - n{u,b)\\v = \\Pu,a - Pu,b\\v 

-1 N 


<( min cr(t) ) y^ max \a'{t)\ ■ \\ui\\o^\\lAi{a)AAi(b)'^Pu,b\ 

Vl^l<ll'^llx / |q<||htz||oo 

^ ^ 1=1 


L2 


< I min 

vl^l<ll'^llx 


a{f) ) V] max |cr'(t)| • ||uj||oo ( / 

/ \JAi{a 


)AAi{b) 


|Vp^ 


The right hand goes to zero as each |Aj(o)AAj(6)| ^ 0 by Lemma 8.1 


1/2 


Since 


iVpu^bl G L‘^{D), and so the continuity of IZ{u,-) follows from the assumed 
continuity of the maps A*. 

□ 


Proof of Proposition jS. 7[ (i) Theorem 2.3 in 17 tells us that the mapping from the 

conductivity to the weak solution of (2.5) is Frechet differentiable with respect to 
the supremum norm, and hence locally Lipschitz. Note that the mapping from 
the solution to the boundary voltage measurements, (v, V) i-A V, is smooth, and 
the assumptions on a imply that it is Lipschitz. It hence suffices to show that the 
mapping u i-A F(u, a) is Lipschitz for each o G A. Let u,v & X and o G A, then 

N 

||F(u,a) - F(n,a)||oo < X “ Vi\\oc'^Ai(a) < C\\u - v\\x 

i=l 
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and the result follows. 


(ii) By Corollary 2.8 in [Cj and the continuity of cr, it suffices to show that o„ —>■ a in 
A implies that F(u, a^) —?• F{u, a) in measure. For any p G (1, oo) we have that 

N 

1 Ai(an)AAi(a) 

i=l 
N 


f \F{u,an)-F{u,a)\Pdx<y2 f 1 “*^^ 

J D -1 J D 


< E IKIISo ■ 

1=1 

From the assumed continuity of A(') h follows that F{u,an) —?• F{u,a) in IF for 
any p G (1, oo), and hence in measure. 


□ 


8.2. Results From Section\^ 


Proof of Theorem \4.^ (i) We first claim that the assumptions on d> mean that 

$(•,•;?/) : X' X A' ^ M is continuous for each y E Y. Fix y E Y and 
(u,a) E X' X A'. Choose any approximating sequence (un,an)n>i Q X' x X 
such that {um an) {u,a). Then the assumptions on the norm on X x A means 
that \\un — u\\x 0 and |a„ — o| —>■ 0. Letting r > max{||n||x, sup^ Hifnllx}, we 
may approximate 

\^{un,an;y) - ^{u,a;y)\ < |<F(«„,o„;p) - $(n,a„;p)| + |$(n,a„;p) - $(«,a;p)| 

< Ms{r, an,y)\\un - u\\x + |$(n,o„;?/) - $(n,o;i/)| 


< supM 3 (r,afc,p) • \\un - u\\x + \^(u,an;y) - ^(u,a-,y)\ 
\ken J 

where the supremum is finite due the continuity of M 3 in its second component. 
Since d> is also continuous in its second component, we see that the right-hand side 
tends to zero as («„, a„) —>■ («, a). 

Now as $(•, •; p) : X' X A' — 7 - E is continuous and (po x Po)iX' x A') = 1, $(•, •; y) 
is fiQ X z/Q-measurable. Setting Z = X' x A', we can consider ^ : Z x Y —>■ E. This 
is a Caratheodory function, and it is known that these are jointly measurable, see 
for example (^. We conclude that $ is po x z/q x Qo measurable. 

(ii) We first show Z(y) is finite. Since po is Gaussian, by Fernique’s theorem there 
exists tt > 0 such that 


exp(Q;||n||^) po{du) < 00 . 


IX 


Then using Assumptions |4.1[ i) , we have the lower bound 
$(n,o;|/) > Ml (a) — q;||«||^ 
from which we conclude that Z{y) < 00 . 

Now fix r > 0. Let y E Y and take (n,o) E X' x A' with max{||«||x, |a|} < r. 
Then we have by the local Lipschitz property 

\^{u,a;y)\ < M 3 {r,y)\\u\\x + |d>(0,a;?/)| < M 3 {r,a,y)r + \^{0,a;y)\. 
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Using the continuity of $ and M 3 in o, we can maximise the right hand side over 
|a| < r to deduce that 

\^{u,a;y)\ < K{r,y). 

Thus $(•, •;!/) is bounded on bounded sets. 

Now we can proceed as in (T^. Using that (/io x Uo){X' x A') = 1, we have that 


Z{y)= / exp(-$(n,a;?/))/ro(dn)z/o(da). 

Jx'xA' 


iX'xA' 

Set B' = (X' X A') n B, and set 

R = sup{max{||n||x, |q-|} | {u,a) e B'}. 
We deduce that 


sup ^{u,a;y) < K{R,y) < 00 


and so 


Z{y) > [ exp{-K{R,y)) iJ,o{du)uo{da) = exp{-K{R,y)){iio x uo){B') > 0. 

Jb' 

Hence the measure fp is well-defined. 

(in) The well-posedness of the posterior is proved in virtually the same way as Theorem 


4.5 in 10 


□ 


8.3. Results From Section\^ 

Throughout this section, for 5 > 0 and {u,a) G X x A, we will denote J^(u,a) = 
fi{B^{u,a)). To prove Theorems 5.1 and 5 . 2 , we first require two lemmas. 


Lemma 8 . 2 . Let {ui,ai), (u 2 , 02 ) £ E x int(S'). Then 

lim iLoX^o)iB^{uuai)) _ ||„,|||_i||„,||| _ pjai) 

5to (/io X Po){B^{u 2 ,a 2 )) p{a 2 ) 

= exp (J(u 2 ) + K{a 2 ) - J{ui) - K{ai)). 

Proof. We adapt the proof of Proposition 18.3 in to first show that 

(/io X no){B\ui, ai)) ~ x uo){B\0,ai)) as 5 i 0. 

The first half of the proof is almost identical to that in though some care must be 
taken since we cannot (a priori) separate the integrals over balls in X x A into products 
of those over balls in X and A. Using the Cameron-Martin theorem we see that 


{po X uo){B\ui,ai)) = e 




gUi,«>E fj,Q(^du)pQ{da). 


Since {ui, —u)e = —{ui,u)e and B^{0,ai) is symmetric about 0 G X, it follows that 


dui,u)i 


' B^0,ai) 


' Poi<iu)uo(da) = 

Jb 


Bp0,ai) 


d (gUiUfi _|_ g {ui,u)e'^ /io(dii)r'o(da) 
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> (yUo X i'o){B\0,ai)) 

which gives the inequality 

(/io X Po){B\ui,ai)) > x uq){B\ 0, ai)). (8.1) 

For the opposite bound, we write {ui, ■)e as the sum of two functionals Zc and on E. 
We aim to choose Zg to be continuous on E, and Zg ‘small’ in some sense. Then we have 
that 


(fiQ X Po){B\ui,ai)) = e 


s /io(du)r'o(da) 

/B'5(o,ai) 

1 


< exp ( + d • sup Zc{u) 

^ (u,a)eB^{0,ai) 


{fiQ X i'q)[B^( 0, ai)) + 


'BS{0,ai) 


_ 1) //o(du)z^o(da) 


where we have used the linearity of Zg to extract S from the supremum. As in 26 , using 
a result from [^, a special case of the Gaussian correlation conjecture, it follows that 
for any (7 G M and any convex set S C X symmetric about 0, 

fio{B n {u G X I |z 3 (m)| > C}) < iio{B)iJ,o{\zs{-)\ > C). 

Then for any increasing function ip : M+ ^ M+, one has 

(p(|z3(u)|)po(du)i^o(da) = / 


'B^0,ai) 


IXxA 
7 S ( 


(p{\zs{u)\)lBS(o,ai){u,a) /ro(du)z/o(do) 


= / (/«o X Uo){{{u,a) G 5'^(0, ai) | (p(|z3(u)|) > t})dt 


(po X iyo){{{u,a) G S^O, ai) | |zs(u)| > ip {t)})dt 


t^o{{u G X I (u, a) G B°{0, ai), |zs(m)| > (p i^o(dn)dt 


'0 JA 

POO 


< 


[ iio{{u e X \ {u,a) e B\0,ai)})iao{\zg{-)\ > ip ^(t)) uo{da)dt 
JA 


11 q{\zs{-)\ > ip ^(t)) ( / iio{{u e X \ {u,a) e B\0,ai)})uo{da) ) dt 


(lAoXuo)(B^(0,ai)) 


= (po X no){B\0,ai)) I /ro(|2;s(-)l > A ^(^))dt 


= {Po X Uo){B\0,ai)) / yUo((p(|z3(-)|) > t) dt 


= (po X i/o)(^'^(0,ai)) / (p{\zs{u)\) iao{du). 

Jx 
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L 


Choosing = exp(-) — 1 in this formula gives 

('gk«(«)l _ l) fj,Q[du)po{da) < (fio X uo){B\0,ai)) f _ i)^Q(dM). 

^■5(0,01) Jx 

The space of linear measurable functionals on E, which contains {ui,-)e, is the 
closure of E*. Thus for any e > 0, the functionals Zc, Zg can be chosen in order that the 
hrst of them is continuous and the second of them satisfies the inequality 


^Q(dM) < e. 


'X 


It follows that for each e > 0 we have 


(fiQ X Po){B\ui,ai)) 


< exp —- 


1 


% + S- sup Zc{u) ] {fio X UQ){B\0,ai)){l + e). (8.2) 

{u,a)&B^ (0,ai) 


Since balls are bounded, £ > 0 is arbitrary and z^ is continuous, we can combine (8.1) 
and ( |8.2| ) to deduce that there exists M > 0 such that 

e-^ll“illl(^Q X uo){B\0,ai)) < (/tq x uo){B\ui,ai)) < x uo){B\0,ai)). 


Now looking at the ratio of measures we see 

(/ip X uo){B^{ui,ai)) ^ ifio X uo){B^{0,ai)) ^ 

5t0 (po X l'o){B\u 2 ,a 2 )) NO (/To X Z/o)(S'5(0,02))' 

We now deal with the geometric parameters. Let a* e int(S') so that p is positive in a 
neighbourhood of a* (we may take a* = ai or 02 since we assume they lie in int(S')). 
Then 


(/ip X Po){B^{0,ar)) ^ fB^( 0 ,ad p(a)Mdu)da 
(po X 02 )) fBS( 0 ,a 2 ) p(a)po(du)da 

/nqo.a*) + ai- a*) po(du)da 

/fifio.a*) + «2 - a*) po(du)da 

^ Lqo.gq po(du)no(da) 

fBHo,a*) ^o(du)no(da) 

For sufficiently small S both of the integrands are continuous. A mean-value property 
hence holds for the integrals, and so we may divide both the numerator and denominator 
by (po X no)(B^(0, a*)) and take limits to obtain 


lim ^ >^o)(B'^(0,ai)) 
NO (/ip X r'p)(5'^(0, 02 )) 


p((2 -|- CLi — (2*) 


p{a) 

a=a* 

p(a + 02 — a*) 


p{a) 

a=a* 


p(Qi) 

P(«2)' 
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We conclude that 

{fiQ X no){B\ui,ai)) ^ _ p(ai) 

5to ino X no){B^{u2,a2)) ^(02) 

= exp {J{u2) + K{a2) — J{ui) — K{ai)). 


□ 


Lemma 8 . 3 . Let /, p : A M he continuous, and {ui,ai), (u2,a2) E E x int(S'). Then 

/B«(„2,a2)^'(®)/^o(^®)^o(do) p{a2) 9(02)' 

Proof. Let £ > 0 . Then by the continuity of / and g, and the assumption on the norm 
on W X A, there exists 5 > 0 such that 

{f{ai)-e){iaoXUo){B^{uuai)) ^ ^o(du)z/o(da) 

(9(02) + e)(go X i^o)(B^(u2,a2)) ~ 9(a) Po(du)no(da) 

^ (/(ai) + s)(9o X no)(B^(ui, ai)) 

“ (g(a 2 ) -£)( 9 o X no)(B^(u 2 ,a 2 ))' 

The result now follows by the previous lemma. □ 


Proof of Theorem B Let {u2,a2) E E x int(S'). The case d> = 0 is the result 

of Lemma 


.2, Now proceeding analagously to 9 


J\ui,ai) _ /B5(„i,ai)®^P(“^(“W))Ao(dM)z^o(da) 

J^{U 2 , 02 ) fB^n 2 ,a 2 ) exp(-^(u, a)) iao(du)uo(da) 

fs^inuai) exp(-d>(u, a) + $(«!, Ui)) exp(-d>(Mi, ai)) /Uo(du)no(da) 
lBHu2,a2) exp(-d>(u, a) + ^{u2,02)) exp(-d>(M2, ^2)) iao(du)uo(da)' 


Using Assumptions | 4 . 1 [iv), we have that for any (u,a), (v,b) E X x A, 


|d>(u,a) - d>(u, 5 )| < M3(r,a)\\u - v\\x + 


where r > max{||u||x, lbl|x}- Now set 


Li = max Msdluillx + 

|a|<|ai|+(^ 

L 2 = max M3 (||u2|U + ^, 0 ), 

|a| < |tl2 |+(^ 

which are finite due to the continuity assumption on M3. Then 

^ g(5(Li+L2)g-$(«i,ai)+$(«2,a2)) 

fj^(u2, U 2 ) 

“ $(ui,oi)|)/ro(du)i/o(do) 
lBHu 2 ,a 2 )^^'d(-\^(u 2 ,a) - ^(U 2 ,a 2 )\) tio(du)uQ{da)' 
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Note that both integrands are continuous in a, and so we may use the previous lemma. 
Taking lim sup^j^g of both sides gives 

54.0 J^{u2,a2) 

A similar method gives that the lim inf^j^o is bounded below by the RHS and so we have 
that for any (ui, 02 ), (u 2 ,Q- 2 ) & E x int(5'), 


J\ui,ai) 

hm —^7 -r 

54,0 U° (112,02) 


_ ^I{u2,a2)-I(ui,ai) 


Noting that I is continuous on £1 x S', we see that I agrees with the Onsager-Machlup 
functional on x S. Finally note that I{u,a) = cxo on (X \E) x A and E x {A\S). □ 


Remark 8.4. Note that the limit above is independent of the choice of norm used on 
the product space X x A when referring to the balls. If we use the norm given by 

||(x,a)|| = max{||a:||x, |a|} 

then we have that 


B\u,a) = B\u) X B\a) 


and so may deduce that, for any choice of norm on X x A, 


(po X nQ){B\ui,ai)) ^ (/tq x i'q){B\ui) x B\ai)) 
5t0 (po X l'Q){B\u2,a2)) (/To X 1'q){B\u 2) X B\a2)) 

_ tiQ{B^{ui)) z/o(R'^(ai)) 

“ lio{B^{u 2 ))' no{B^{02))' 

This will be useful later for separating integrals. 


Proof of Theorem 15-4 We follow the idea of the proof of Theorem 5.4 in [^, which is 
based on and (^, and first show / = 4> + J + Xis weakly lower semicontinuous 
on T X S. Let ^ {u,d) in E x S. Since S C weak convergence of 

the second component is equivalent to strong convergence. Since Hq{X) — 1 , T is 
compactly embedded in X and so Un ^ u strongly in X. In the proof of existence of 
the posterior distribution we showed that •F is continuous on X x A, and so we deduce 
that 4>(M„,a„) ^(u,a). Hence 4> is weakly continuous on T x S. The functional J is 

weakly lower semicontinuous on E and K is continuous on S, and so / is weakly lower 
semicontinuous on T x S'. 

Now we show / is coercive on T x S. Since E is compactly embedded in X there 
exists a C* > 0 such that 

Therefore by Assumption |4.1[ i) it follows that, for any £ > 0, there is an M(£) G M 
such that 









MAP Estimators for Piecewise Continuous Inversion 


36 


Since K is bounded belo^jilby — log ||p||oo, we may incorporate this into the constant 
term M{e): 

I(u,a)>M{e)+(^^-Ce^ ||n|||. 

By choosing e = 1/4(7, we see that there is an M G M such that, for all (u, a) ^ E x S, 
I{u, a) > ^||n|||; + M 
which establishes coercivity. 

Now take a minimising sequence (m„, a„) such that for any 5 > 0 there exists an 
Ni = Ni{d) such that 

M < I < I{un, an) < i + S, Vn > 

From the coercivity it can be seen that the sequence {un, a^) is bounded in F' x S'. Since 
ExS is a. closed subset of a Hilbert space, there exists {u,a) & E x S such that (possibly 
along a subsequence) {un,an) {u,a) in E x S. From the weak lower semicontinuity 
of I it follows that, for any 5 > 0, 

i < I{u,a) <I+S. 

Since 5 is arbitrary the first result follows. 

Now consider the subsequence («„,o„) ^ (^, 0 ). The convergence of ^ a is 
strong, so all that needs to be checked is that Un ^ u strongly in X. This follows from 
exactly the same argument as in the proof of Theorem 5.4 in (taking a as the second 
parameter in I and $) and so the second result follows. □ 


Before proving Theorem |5.3| we first collect some results on centred Gaussian 
measures from [^, specifically Lemmas 3.6, 3.7, and 3.9. For u E X, let 

Join) = iao(B\u)). 

Proposition 8.5. (i) Let 5 > 0 and u E X. Then we have 

JS(«) 


J?(o) - 

where c = exp (^5^) and oi is a constant independent of z and 5. 

(ii) Suppose that u ^ E, (m^) 5 >o Q X and converges weakly to u E X as h 4' 0. 
Then for any e > 0 there exists 5 small enough such that 


< ce-^(\\u\\x-s)^ 


jm 


< e. 


(Hi) Consider (u^)s>o ^ X and suppose that converges weakly and not strongly to 0 
in X as 5 4- 0. Then for any £ > 0 there exists 5 small enough such that 


Jo'iO) 


< s. 


II Recall in subsection 3.3 we assumed p to be continuous on the compact set S', and hence bounded. 
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Proof of Theorem \5.3[ (i) We first show {w,a°) is bounded in X x A. The 
boundedness of the second component is clear since S is bounded, so it suffices to 
show that (u^) is bounded in X. This is proved in the same way as in Theorem 

3.5 in |§. 

In the proof of existence of the posterior measure, Theorem |4.2[ we show that if 
r > 0 and ||u||x; hi < then there exists K{r) > 0 such that $(u,a) < K{r). 
Letting c = > 0, it follows in the same was as that, given any a E S, 

for h < 1 we have 


Joiu\a)>cJo\0,a). 

Suppose that (u^) is not bounded in X so that for any R > 0 there exists 5r such 
that > R, with 5r ^ 0 as R ^ oo. Then the above bonnd says that 

J-cf(0,a) ■ MB^{a)) - 

This contradicts Proposition |8.5[ i) above. Therefore there exists R,Sr > 0 snch 
that 


\\{u\a^)\\xxA < R for any S < Sr. 


Hence there exist {u,a) E X x A and a subsequence of (u^,a^)o^s<6R which 
converges weakly in X x A to {u,a) as h 4- 0. For simplicity of notation we 
still call this subseqnence {u^,a^). 

We now show that {u^,a^) converges strongly to an element of E x S. We first 
show that {u, a) E X x S. 

Note that any limit point of must he in S. Suppose it did not, and a limit point 
was a* ^ S. Then there exists > 0 such that along a subsequence converging to 
a*, 5 implies ^ S since S is closed. For 5 < ^dist(o*, S) A 5’^ we then have 
B^{af) n S' = 0. In particular UQ{B^{af)) = 0 for all such 5, which in turn implies 
J^{u,a^) — 0 for any u E X contradicting the definition of . It follows that we 
must have a E S. 

We need to show u E E. From the definition of {u^,af) and the bounds on d> we 
have for 5 small enongh and somber close to 1, 

_ K{l)-M Ao(d'») 

/Bqo)/^o(du) ■ 


We use Proposition 8.5[ ii). 
enough such that 


Supposing u ^ E, for any £ > 0 there exists 5 small 


/gq„6)Ao(dM) 
/sqo) /^o(du) 


% Remark 8.4 tells us that we can separate the integrals in the limit J ^ 0. 
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We may choose e = to deduce that there exists S small enough such 

that 


J‘(u\0) 1 

- J*(0.0) 2 


which is a contradiction, and so u E E. 

Knowing that {u,a) E E x S we now show that the convergence is strong. Any 
convergence of the second component will be strong and so we just need to show 
that —>■ h strongly in X. Suppose the convergence is not strong, then we may 
use Proposition |8.5[ iii) on the sequence — u. The same choice of £ as above 
leads to the same contradiction, and so we deduce that u ^ u strongly in X and 
the hrst result is proved. 


(ii) We now show that (u,o) is a MAP estimator and minimises I. As in [^, and the 

Assumptions |4.1[iii) to see that 


proof of Theorem 5.1 


we can use 


J\u\a^) SiLi+L2) -HC,C)+^iu,a)) 

J^{%a) - 

“ ^{u\a^)\) Po{du)uQ{da) 
- $(u,a)|)/ro(du)z/o(da) 


where 


Li = max MEllu^Wx + S.a), 

|a|<|ai|+'5 

L 2 = max ME\\u\\x + S,a). 


Therefore using the continuity of as shown in the proof of existence of the 
posterior distribution, and that {u^,a^) —?• (u,a) strongly in X x A, 


lim sup 

54,0 


J^{u,a) 


< lim sup 
54,0 


fB^(uEa^) Po(du)z^o(da) 
fB^(u,a) Mdu)iyo(da) 


Suppose is not bounded in E, or if it is, it only converges weakly (and not 
strongly) in E. Then HhUi; < liminf^^o and hence for small enough S, 

IIuIIb < II'u'^IIe. Therefore, since /tq is centered and ||m'^ — u\\x 0, \af — a| —?• 0, 


lim sup 

54,0 


/sh«,a) Lo(du)i^o(da) 


= lim sup 
( 54.0 


< lim sup 

( 54,0 


< lim sup 
54,0 


= lim sup 
54.0 


Lq„qLo(du)/g,(^,)t/o(da) 

Lo(du)/sqs) J^o(da) 

Lq„.)Lo(du) /nqgq ^o(da) 

Sb^{u) f^o{du) ™ 4 ,r^ (-) (da) 
/nqa^)^o(d«) 

IsH-a) ^o(da) 

|sqa«)| fs^a^) P(®) dn 
inqs)! fB^(d) P(®) dn 


= 1 . 
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The final equality above follows from the continuity of the integrand and the fact 
that \a^ — a| ^ 0: both the numerator and the denominator tend to p(a). 

Since by definition of > J^{u,a) and hence 

> 1 . 

5to J\u,a) 


this implies that 




(8.3) 


< 54-0 J^{u,a) 

In the case where {u^) converges strongly to u in E, we see from the proof of 
Lemma 18.21 that we have 

l-iWu^Wl-Msit^o X Uo)iB\0,a^)) (/io x uo){B\u\a^)) 


62 


(/io X Po){B^iO,a)) (/io X Po)iB^{u,a)) 

(/io X uo){B^{0,a^)) 
(/ioXi/o)(5^(0,a))' 

Since we have ^ u strongly in E we have in particular that 


1 “ llfi 

—7- 1 as 5 4. 0. Now using the continuity of p and 


It follows that e5 

the fact that |a'^ — a| —>■ 0, an argument similar to that in the proof of Lemma 
shows that 

{pqX UQ){B\Q,a^) 

iim --- ———- = 1. 

5to (^0 X no)(B^(0,a)) 

We therefore deduce that 

Mdu)iyo(da) ^ ^ 


and (8.3) follows again. Therefore (u,a) is a MAP estimator of the measure p. 
The proof that (fi, a) minimises / is identical to that in the proof of Theorem 3.5 
in [^. 

□ 
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Figure 10: (Model 1) the true log-permeability field (top), and 50 local minimisers arising 
from minimisation initialised at draws from a smoothed prior distribution. Simulation 
12 has the lowest functional value, with /(^maP) ®map) = 2847. 
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Figure 11: (Model 2) the true log-permeability field (top), and 50 local minimisers arising 
from minimisation initialised at draws from a smoothed prior distribution. Simulation 7 
has the lowest functional value, with 7(u^^p, = 2567. The minimisers have been 

divided into classes based on similar characteristics. 
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Figure 12: (Model 3) the true log-permeability field (top), and 50 local minimisers arising 
from minimisation initialised at draws from a smoothed prior distribution. Simulation 
20 has the lowest functional value, with /(m^ap) ®map) = 2117. The minimisers have 
been divided into classes based on similar characteristics. 
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Figure 13: (Model 1) the true log-permeability field (top), and the conditional mean 
arising from MCMC chains initialised at the corresponding local minimisers above. 
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Figure 14: (Model 2) the true log-permeability field (top), and the conditional mean 
arising from MCMC chains initialised at the corresponding local minimisers above. We 
group them into the same classes as the local minimisers. 
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Figure 15: (Model 3) the true log-permeability field (top), and the conditional mean 
arising from MCMC chains initialised at the corresponding local minimisers above. We 
group them into the same classes as the local minimisers. 
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Figure 16: (Model 1) The evolution of $ as the MCMC chains progress. The horizontal 
lines represent the value of each local minimiser under d>. Nearly all of the simulations 
find a small value of d> almost immediately, but simulation 23 remains caught in the 
local minimiser for some time before it follows. 
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Figure 17: (Model 2) The evolution of $ as the MCMC chains progress. The horizontal 
lines represent the value of each local minimiser under The majority of the simultions 
find a small value of $ almost immediately, but numerous fail to reach there, settling 
in local minima. The shape of these minima can be seen in Figure and generally 
correspond to those in the same class as the initial state. 
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Figure 18: (Model 3) The evolution of $ as the MCMC chains progress. The horizontal 
lines represent the value of each local minimiser under The majority of the simultions 
find a small value of $ almost immediately, but numerous fail to reach there, settling 
in local minima. The shape of these minima can be seen in Figure and generally 
correspond to those in the same class as the initial state. 













































MAP Estimators for Piecewise Continuous Inversion 


51 



Figure 19: (Model 1) The trace of as defined by (6.1), when the chain is initialised 
at a variety of minimisers - specifically numbers 1, 2,..., 8. 
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Figure 20: (Model 2) The trace of as defined by (6.1), when the chain is initialised 
at a variety of minimisers - specifically numbers 7,14,21,28,35,39,46 and 50. The 
different classes are alternately shaded. 
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Figure 21: (Model 3) The trace of as defined by (6.1), when the chain is initialised 
at a variety of minimisers - specifically numbers 7,13,21,33,38,47,48 and 49. The 
different classes are alternately shaded. 




















































































































































































































